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Abstract 

In this paper, we consider the harmonic extension problem, which is widely 
used in many applications of machine learning. We find that the transitional method 
of graph Laplacian fails to produce a good approximation of the classical harmonic 
function. To tackle this problem, we propose a new method called the point inte¬ 
gral method (PIM). We consider the harmonic extension problem from the point 
of view of solving PDEs on manifolds. The basic idea of the PIM method is to 
approximate the harmonicity using an integral equation, which is easy to be dis¬ 
cretized from points. Based on the integral equation, we explain the reason why the 
transitional graph Laplacian may fail to approximate the harmonicity in the clas¬ 
sical sense and propose a different approach which we call the volume constraint 
method (VCM). Theoretically, both the PIM and the VCM computes a harmonic 
function with convergence guarantees, and practically, they are both simple, which 
amount to solve a linear system. One important application of the harmonic exten¬ 
sion in machine learning is semi-supervised learning. We run a popular semi- 
supervised learning algorithm by Zhu et al. GU over a couple of well-known 
datasets and compare the performance of the aforementioned approaches. Our 
experiments show the PIM performs the best. 


1 Introduction 

In this paper, we consider the following harmonic extension problem. Let X = {xi, • • • 
be a set of points in and i? be a subset of X. Given a function g over B, let 
Cg = {u : X —K|ub = g} be the set of functions on X whose restriction to B 
coincides with g. Denote = u(xi). The goal of the harmonic extension problem is 
to find the smoothest function in Cg. 

A commonly used approach is based on graph Laplacian g\M- Let Wij = 
exp(—) be the Gaussian weight between Xi,Xj for some parameter t. Con¬ 
sider the following quadratic energy functional over Cg. For any u S Cg E{u) = 
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Figure 1: One-dimensional examples 


i j Wij (ui — Uj)^. The small energy E{u) means the function u takes similar val¬ 
ues at nearby points, and the minimizer of this energy is considered as the smoothest 
function in Cg. It is not difficult to see that the minimizer u satisfies that £u = 0 on the 
points in X \ i? and ub = g- Here C is the weighted graph Laplacian given in matrix 
form as C = — W) where W = (tUy) is the weight matrix and V = diag(di) 

with di = Wij. We call u is discrete harmonic if £u = 0. The minimizer u can be 
computed by solving the linear system: 

r C{X\B,X)n = Q, 

\ u(Xi) = g(x,;), VXi e B 

where £(X \ X) is a submatrix of C by taking the rows corresponding to the subset 
X \ B. We call this approach of harmonic extension the graph Laplacian method 
(GLM). Note that the factor i in £ is immaterial in GLM but introduced to compare 
with other methods later. 

Consider the following simple example. Let X be the union of 198 randomly 
sampled points over the interval (0, 2) and B = {0,1, 2}. Set g = 0 at 0, 2 and g = 1 
at 1. We run the above graph Laplacian method over this example. Figure[T](a) shows 
the resulting minimizer. It is well-known that the harmonic function over the interval 
(0, 2) with the Dirichlet boundary g, in the classical sense, is a piece linear function, 
i.e., u{x) = X for X £ (0,1) and u{x) = 2 — x for x S (1, 2); Clearly, the function 
computed by GLM does not approximate the harmonic function in the classical sense. 
In particular, the Dirichlet boundary has not been enforced properly, and in fact the 
obtained function is not even continuous near the boundary. 

In this paper, we propose a new method which we call point integral method (PIM) 
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to compute harmonic extension. Figure [T](b) shows the harmonic function computed 
by PIM over the same data, which is a faithful approximation of the classical harmonic 
function. The point integral method is very simple and computes the harmonic exten¬ 
sion by solving the following modified linear system: 

£u + B)ub = B)g, (2) 

where yV(X, B) is a submatrix of the weight matrix W by taking the columns corre¬ 
sponding to the subset B. Here the parameter p is a fixed number whose choice will 
be described in Section|2 We consider the harmonic extension problem from the point 
of view of solving PDFs on manifolds, and derive the point integral method by ap¬ 
proximating the Laplace equation using an integral equation which is then discretized 
using points. Note that the Dirichlet boundary may not be exactly enforced in PIM. 
Nevertheless, when the points X and B uniform randomly sample a submanifold and 
its boundary respectively in the iid fashion, the harmonic function computed by PIM is 
guaranteed to converge to the one in the classical sense. See Theorem l2.2l 

We will show the derivation of the point integral method, and explain the reason 
that the graph Laplacian method fails to produce a faithful harmonic extension and 
propose two approaches to modify the GLM: one is to thicken the boundary and the 
other is to change weights. Figure[T](c) and (d) shows the resulting harmonic functions 
computed by the above approaches. We call the first approach the volume constraint 
method (VCM). The second approach requires an additional structure of meshes which 
are in general not available in machine learning problems. The main purpose that we 
discuss the second approach is to show that it is very subtle to choose the weights for 
the GLM to recover the classical harmonic extension. For the mathematical proof of the 
convergence of PIM and VCM, the interested readers are referred to the papers EHn. 

One important application of the harmonic extension in machine learning is semi- 
supervised learning ifTsll . We will perform the semi-supervised learning using the PIM 
and the VCM over a few well-known data sets, and compare their performance to GLM 
as well as the closely related method by Zhou et al. m. The experimental results show 
that the PIM have the best performance and both the PIM and the VCM out-perform 
the GLM and the method by Zhou et al. 

1.1 Related work 

The classical harmonic extension problem, also known as the Dirichlet problem for 
Laplace equation, has been studied by mathematicians for more than a century and has 
many applications in mathematics. The discrete harmonicity has also been extensively 
studied in the graph theory a. For instance, it is closely related to random walk and 
electric networks on graphs 131. In machine learning, the discrete harmonic extension 
and its variants have been used for semi-supervised learning MM- 

Much of research has been done on the convergence of the graph Laplacian. When 
there is no boundary, the pointwise convergence of the graph Laplacian to the manifold 
Laplacian was shown in 0117181121 , and the spectral convergence of the graph Laplacian 
was shown in ||2l. When there is boundary. Singer and Wu ifTSl and independently Shi 
and Sun ifTTl have shown that the spectra of the graph Laplacian converge to that of 
manifold Laplacian with Neumann boundary. 
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The discrete harmonic extension problem was studied in the community of numer¬ 
ical PDEs and its convergence result is well-known if the Laplacian matrix is derived 
based on finite difference provided that the points lie on a regular grid, or based on 
finite element provided that the points are the vertices of a well-shaped mesh tessellat- 
ing the domain. However, both assumptions on the points are difficult to be satisfied 
in the applications of machine learning. Du et al. 0 considered the nonlocal diffusion 
problems which is modeled by a similar integral equation. They observed that the reg¬ 
ularity of the boundary condition can not infer the regularity of the harmonic extension 
and thus proposed to thicken the boundary and employed volume constraint. 

2 Point Integral Method 

Let j\4 and dj\4 be a submanifold in and its boundary respectively. Given a smooth 
function g over dA4, the harmonic extension u of p in the classical sense is the solution 
to the following Laplace equation with the Dirichlet boundary: 

f -Amu(x) = 0, xG M 

I u{x) = g{x), X G dM 

where is the well-known Laplace-Beltrami operator. 

We observe that the Laplace equation is closely related to the following integral equa¬ 
tion. 


1 

t 



u(y))wt(x,y)dy - 2 



du(y) 

dn 


Wt(x,y)dTy 


= 0 , 


(4) 


I _ 

where Wt(x, y) = exp(— -A-AA) We will show the derivation of the integral equation 
later. Specifically, we proved the following theorem in IfTTI . 


Theorem 2.1 If u G be (2 harmonic function on Ad, i.e., Aj\/iu — 0, then we 

have for any x G M, 


1 

1 



u{y))wt{x,y)dy - 2 



dujy) 

dn 


Wt{x,y)dTy 


L'^(M) 


(5) 


Denote 


Ltu{x) = 7 / (u(x) 
I Jm 

- u(y))u;t(x,y)dy, and 

(6) 

jdu 

f du{y) 

L an 

(7) 


Notice that if the point set X = {xi, • • • , x„} samples the submanifold A4, then Ltu 
is discretized and well approximated by £u up to the volume weight where u = 
(u(xi), • • ■ , u(x„)). If we consider the Laplace equation with the Neumann boundary 
where ^ = ft, on dM. is given, we can approximate ^ (x) using X^bieB 'Wt{x, bi)ft(bi) 
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up to the surface area weight provided that the point set B = (bi, • • ■ , b^) C X 
samples dXi. Then we can discretize the integral equation (IHi using the linear system 


Cu-2 


n\dM\ 

m\M\ 


W(X,5)h = 0, 


( 8 ) 


where h = (/i(bi), • • • , /i(bm)). We now use the Robin boundary u + = g 

on dAi to bridge the Dirichlet boundary and the Neumann boundary as follows. For 
a small parameter fi, the above Robin boundary approximates the Dirichlet boundary 
u = g on dAi. At the same time, we can write the Neumann boundary ^ = ^{g — 
ub)- Therefore, the harmonic extension problem (O can be numerically solved by the 
following linear system 


2 n\dA\\ 
(3 m\A4\ 


W{X,B){g-UB) = 0 


(9) 


which is the same as the linear system (|2l) if set p = j ■ Indeed, we have proved 
the following theorem in ifTTll which bounds the difference between the harmonic func¬ 
tion u solving the classical harmonic extension problem (O and the harmonic extension 
u using PIM by solving the linear system (|9]l. 

Theorem 2.2 Let u be the solution to the problem Q and u solves the linear sys¬ 
tem (0. Assume X and B sample A4 and dAi uniform randomly in iid fashion re¬ 
spectively. Then there exist sequences of t(n, m) —>■ 0 and l3{t(n, m)) —> 0 so that in 
probability 


lim ||u -/(u)||i, 2 (^) = 0 

n,m—foo 


( 10 ) 


where I{u) is a function on A4 interpolating u defined as 


/(u)(x) 


Exj gx Wt (x, Xj )uj + tp ea Wt (x, Xj) (g^- - ) 

Ex, ex w^t(x,Xj) 


( 11 ) 


Although the convergence results hold only when the input point sets sample sub¬ 
manifolds, the point integral method can apply to any point sets in W^. In the examples 
shown in the paper, we take /? = 10"*^ , and thus p = 10^^. 

Integral Equation: We now derive the integral equation (|4|i. Here we assume Ad is an 
open set on For a general submanifold, the derivation follows from the same idea 
but is technically more involved. The interested readers are referred to d. Thinking 
of wt (x, y) as test functions, by integral by parts, we have 


[ AMWt(x,y)dy = - [ Vu • V'u;t(x, y)dy-f [ |^u;t(x,y)dTy 

J M J M JdM 

= 4 / (y-x)-VM(y)'u;t(x,y)dy-f [ |^Wt(x, y)#^) 

JM JdM 

The Taylor expansion of the function u tells us that 

u{y) - m(x) = (y - x) • Vu(y) - - x)^Hu(y)(y - x) -f 0(||y - x||3), (13) 
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where Hu(y) is the Hessian matrix of u at y. Note that ||y — x||"ryi(x, y)dy = 
We only need to estimate the following term. 


J7 [ (y-x)^H„(y)(y-x)M;t(x,y)dy 
J M 

= 77 / (y* -Xj)i9*jM(y)wt(x,y)dy 

J M 

= -7; I {y^ - Xi)diju{y)dj {wt{x, y)) dy 

^ Jm 

= 7 / dj{y^-x^)d^ju{y)wt{x,y)dy+ ^ {y, - Xi)dijju{y)wt{y,x)dy 

^ Jm ^Jm 

“o / (yt - x*)nj%u(y)w;t(x,y)dTy 
^ JdM 

= 7; I ^u{y)wt{x,y)dy-I- I {y^ - x,)njd,ju{y)wt{x,y)dTy + OitWA) 

^ JM ^ JdM 


Now consider the second summand in the last line is Although its Lad-M.) 

norm is of constant order, its L^{M) norm is of the order due to the fast decay 

of wt(x, y). Therefore, the integral equation (HI and Theorem 12.11 follow from the 
equations dni), d). 


3 Volume Constraint Method 

From Theorem 12. II we see that the discrete harmonicity £u = 0 is not necessary able 
to approximate the classical harmonicity Ax it = 0, as there is an extra term /tf^ 
of the integration over the boundary dA4 in the integral equation |4] This explains 
why the graph Laplacian method fails to approximate the harmonic extension in the 
classical sense. In this section, we will describe two approaches to modify the graph 
Laplacian method so that the discrete harmonicity £u = 0 approximates the classical 
harmonicity = 0. 

The first approach is based on the observation that the Gaussian weight wt (x, y) 
decays exponentially fast. Let Ad* = {x e Ad|(ix(x, 9Ad) > for any <5 > 0. 

For any points x G Ait and y G dAi, Wt(x, y) = o(t^) for any s and is very small 
for small t, and so is the term Therefore, for a point x G Ait, Ltu(x) = 0 

well approximates Axm(x) = 0. For the remaining points in the thickened boundary 
Ai \ A4t, since the harmonic function are smooth, we can approximate it(x) = g(x) 
for X G Ai\ Ait where x is the closest point to x on the boundary dAi. In the discrete 
setting, let Bt = {xi G X\xi G Ai \ Ait}. We discretize the harmonic extension 
problem using the following linear system. Denote x^ the closest point of x^ in B. 

r C{X\Bt,X)n = 0, 

\ u(Xi) = g(x,;), VXi e Bt 

This way of enforcing the Dirichlet condition by thickening the boundary also appeared 
in the paper 0 by Du et al, which they call the volume constraint. Du et al. consider 
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the problem of non local diffusion and the nonlocal operator in their setting takes the 
same form as (|6]l. We call this approach the volume constraint method (VCM). The 
following theorem has been proven in nni which guarantees the convergence of VCM. 


Theorem 3.1 Let u be the solution to the problem @ and u solves the linear sys¬ 
tem Assume X and B sample A4 and dA4 uniform randomly in iid fashion re¬ 
spectively. Then there exists a sequence oft(n, m) Q so that in probability 


lim ||u-/(u)||i 2 (;K^) =0 

n,m—>-oo 


where /(u) is a function on Mt interpolating u defined as 

ExjgxWi(x,Xj)Uj- 
ExjGXW^t(x,Xj) 


(16) 


(17) 


Similar to the PIM, although the convergence result for the VCM holds only when the 
input point sets sample submanifolds, the VCM can apply to any point sets in 

The second approach is to make the term vanish by choosing the weights 
so that W((x, y) = 0 for any y S dM. In one-dimensional case, we can use finite 
element method to obtain the weights. Sort the sample points Xi < ■ ■ ■ < Xn- We can 
set the weight function at x^ as the well-known hat function: Wt{x„x) = for 

X € (xi_i,Xi) and wt{xi,x) = for x G (x^jX^+i), which vanishes at the 

boundary. Figure [TJd) shows the resulting harmonic extension computed by solving 
the linear system ([T]i. In higher-dimensional case, if we are given a triangular mesh 
of the submanifold having the sample points X as vertices, we can again use finite 
element method to obtain the weights. We omit the detailed derivation of the weights 
in higher dimensional case. From the standard analysis of finite element method, with 
such weights, the solution computed by the linear system[T]of GLM approximates the 
classical harmonic extension @ with convergence guarantees. However, in the typical 
setting of machine learning or data analysis, this type of weights is very difficult, if not 
impossible, to obtain. Therefore, although it is popular in scientific computing, this 
approach is in general not applicable to machine learning problems. 


4 Semi-supervised Learning 

In this section, we briefly describe the algorithm of semi-supervised learning based 
on the harmonic extension proposed by Zhu et al. (na. We plug into the algorithm 
the aforementioned approaches of harmonic extension, and apply them to several well- 
known data sets, and compare their performance. 

Assume we are given a point set X = {xi, • • • , Xm, Xm-i-i, • ■ • , x„} C and a 
label set {1, 2 , • • • ,1}, and the label assignment on the first m points L : {xi, • • • , x™} — 
{ 1 , 2 ,-- - ,/}. Ina typical setting, m is much smaller than n. The purpose of the semi- 
supervised learning is to extend the label assignment L to the entire X, namely, infer 
the labels for the unlabeled points. 
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Think of the label points as the boundary B = {xi, • • • , x^}- For the label i, we 
set up the Dirichlet boundary g* as follows. If a point xj € B is labelled as i, set 
g‘(xj) = 1, and otherwise set g*(xj) = 0. Then we compute the harmonic extension 
u® of g® using the aforementioned approaches. In this way, we obtain a set of I har¬ 
monic functions u^, u^, • • • , We label Xj using k where k = argmaxi<i u®(xj). 
The algorithm is summarized in Algorithm[T] Note that this algorithm is slightly differ¬ 
ent from the original algorithm by Zhu et al. m where only one harmonic extension 
was computed by setting g®(xj) = fc if xj has a label k. 


Algorithm 1 Semi-Supervised Learning 

Input: A point set X = {xi, • ■ • , x^, x^+i, • • • , x„} C and a partial label as¬ 
signment L : {xi,-- - ,x„} ,1} 

Output: A complete label assignment L : X ^ {1,2, - ■ ■ ,1} 

1 : for i = 1 ■. I do 
2: for j = 1 : m do 

3: Set g®(xj) = 1 if L{xj) = i, and otherwise set g®(xj) = 0. 

4: end for 

5: Compute the harmonic extension u® of g®. 

6: end for 

7: for j = m -b 1 : n do 

8: i(xj) = k where k = argmaxi<; u®(xj). 

9: end for 


4.1 Experiments 

We now apply the above semi-supervised learning algorithm to a couple of well-known 
data sets: MNIST and 20 Newsgroups. We do not claim the state of the art performance 
on these datasets. The purpose of these experiments is to compare the performance of 
different approaches of harmonic extension. We also compare to the closely related 
method of local and global consistency by Zhou et al. m. 

MNIST : In this experiment, we use the MNIST of dataset of handwritten digits |[3l , 
which contains 60k 28 x 28 gray scale digit images with labels. We view digits 0 ^ 9 as 
ten classes. Each digit can be seen as a point in a common 784-dimensional Euclidean 
space. We randomly choose 16k images. Specifically, there are 1606, 1808, 1555, 
1663, 1552, 1416, 1590, 1692, 1521 and 1597 digits in 0 ~ 9 class respectively. 

To set the parameter t, we build a graph by connecting a point Xi to its 10 near¬ 
est neighbors under the standard Euclidean distance. We compute the average of the 
distances for Xi to its neighbors on the graph, denoted hi. Let h be the average of 
hi’s over all points and set t = h'^. The distance |xi — Xj | is computed as the graph 
distance between Xi and Xj. In the method of local and global consistency, we follow 
the paper Ql and set the width of the RBE kernel to be 0.3 and the parameter a in the 
iteration process to be 0.3. 

Eor a particular trial, we choose k {k = 1, 2, • • • , 10) images randomly from each 
class to assemble the labelled set B and assume all the other images are unlabelled. Eor 








(a) (b) 

Figure 2; (a) the error rates of digit recognition with a 16000-size subset of MNIST 
dataset; (b) the error rates of text classification with 20-newsgroups.rec(a 8014- 
dimensional space with 3970 data points). 


each fixed k, we do 100 trials. The error bar of the tests is presented in Figure|2](a). It 
is quite clear that the PIM has the best performance when there are more than 5 labelled 
points in each class, and the GLM has the worst performance. 

Newsgroup: In this experiment, we use the 20-newsgroups dataset, which is a classic 
dataset in text classification. We only choose the articles from topic rec containing four 
classes from the version 20-news-18828. We use Rainbow (version:20020213) to pre- 
process the dataset and finally vectorize them. The following command-line options 
are requirecQ (1)- -skip-header: to avoid lexing headers; {2)- -use-stemming: to mod¬ 
ify lexed words with the ‘Porter’ stemmer; (3)- -use-stoplist: to toss lexed words that 
appear in the SMART stoplist; (4)- -prune-vocab-by-doc-count=5: to remove words 
that occur in 5 or fewer documents; Then, we use TF-IDF algorithm to normalize the 
word count matrix. Finally, we obtain 3970 documents (990 from rec.autos, 994 from 
rec.motorcycles, 994 from rec.sport.baseball and 999 from rec.sport.hockey) and a list 
of 8014 words. Each document will be treated as a point in a 8014-dimensional space. 

To deal with text-kind data, we define a new distance introduced by Zhu et al. Il6l: 
the distance between Xi and Xj is d{xi,Xj) = 1 — cos a, where a is the angle between 
Xi and Xj in Euclidean space. Under this new distance, we ran the same experiment 
with the same parameter as we process the above MNIST dataset. The error bar of 
the tests for 20-newsgroups is presented in Eigure|2](b). A similar pattern result is ob¬ 
served, namely the PIM has the best performance when there are more than 2 labelled 
points in each class, and the GLM has the worst performance. 


5 Conclusion 

We have presented two new approaches for solving the harmonic extension problem. 
Both are simple but have theoretical guarantees. We have also compared their perfor¬ 
mance in the application of semi-supervised learning. In the future, we will test these 
methods on more datasets and find different applications of harmonic extension. 

* all the following options are offered by Rainbow 
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